This is academic research that applies statistical analysis in R to agency A of an existing betting consultancy, firm A. Following Dixon and Pope (2004) (the 24th paper under "industry knowledge and academic research" in 7.4 References), and for reasons of business confidentiality and privacy, I also use the names agency A and firm A in this paper. The purpose of the analysis is to measure the staking model of firm A. For more examples of using R for soccer betting, see http://rpubs.com/englianhu. For R Markdown itself, see the rmarkdown references and An Introduction to R Markdown. If you are interested in writing a data analysis on sportsbooks, you are welcome to read Tony Hirst (2014) (the 1st paper under "technical research on programming and coding" in 7.4 References).
Before we start modelling, we look at the summary of investment return rates.
table 4.1.1 : 5 x 5 : Summary table of annual investment returns. (The styled table is drawn in dark yellow, hexadecimal color code #9b870c, taken from the list of colors.)
\[\Re = \sum_{i=1}^{n}\rho_{i}^{EM} \Big/ \sum_{i=1}^{n}\rho_{i}^{BK} \qquad \text{(equation 4.1.1)}\]
\(\Re\) is the rate of return on investment. \(\rho_i^{EM}\) is the estimated probability calculated by firm A for match \(i\), over matches \(1, 2, \dots, n\), while \(\rho_{i}^{BK}\) is the net (pure) probability, i.e. the real odds, offered by bookmakers, obtained by substituting equation 4.1.2 into equation 4.1.1.
\[\rho_i = \frac{P_i^{Lay}}{P_i^{Back} + P_i^{Lay}} \qquad \text{(equation 4.1.2)}\]
\(P_i^{Back}\) and \(P_i^{Lay}\) are the fair back and lay prices offered by bookmakers.
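As a minimal sketch of how equations 4.1.1 and 4.1.2 fit together, the R snippet below computes the bookmakers' net probabilities from back and lay prices and then the return rate. The prices, the firm's probabilities, and the helper names `net_prob` and `return_rate` are hypothetical and chosen only for illustration; they are not taken from firm A's data.

```r
# Hypothetical back and lay prices for three matches (illustrative only).
odds <- data.frame(
  back = c(1.90, 2.10, 1.75),
  lay  = c(2.00, 1.85, 2.20)
)

# Equation 4.1.2: net probability from the back and lay prices.
net_prob <- function(back, lay) lay / (back + lay)

# Equation 4.1.1: ratio of summed estimated to summed bookmaker probabilities.
return_rate <- function(rho_em, rho_bk) sum(rho_em) / sum(rho_bk)

# Hypothetical probabilities estimated by the firm for the same matches.
rho_em <- c(0.55, 0.50, 0.60)
rho_bk <- net_prob(odds$back, odds$lay)

re <- return_rate(rho_em, rho_bk)
print(re)
```

A value of `re` above 1 would mean the estimated probabilities exceed the bookmakers' net probabilities in aggregate.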
We can simply apply the equations above to obtain \(\Re\). From the table above, the EMPrice calculated by firm A was invested at a threshold edge of 1.0769894, 1.1072203, 1.0781056, 1.1148426 and 1.0671108, that is, at prices that much greater than the prices offered by bookmakers. \(\Re\) is described in Dixon and Coles (1996) (the 25th paper under "industry knowledge and academic research" in 7.4 References). The optimal value of \(\rho_{i}\) (rEMProbB) will be calculated with a bootstrapping/resampling method in section 4.3 Kelly Ⓜodel.
table 4.1.2 : 48640 x 45 : Odds price and probabilities sample table.
The table above lists a sample of the odds prices and probabilities for soccer match \(i\), where \(n\) is the number of soccer matches. It gives values such as rEMProbB and netProbB.
graph 4.1.1 : A sample graph of the relationship between the investment probabilities and the bookmakers’ probabilities.
The graph above plots the probabilities calculated by firm A for its bets against the real probabilities offered by bookmakers over 48640 soccer matches.
Now we look at the result of the soccer matches.
table 4.1.3 : 7 x 8 : Summary of betting results.
The table above summarizes the stakes and returns by soccer match result. The table below lists the handicaps of the bets placed by firm A through agency A. I list the handicaps before testing the coefficients by handicap in the next section, 4.2 Linear Ⓜodel.
table 4.1.4 : 6 x 8 : The handicap in sample data.
From our understanding of staking, the only covariate we need to consider is the odds price, since the effect of the handicap is already settled in the EMOdds for each handicap.
Again, I don’t pretend to know the correct Ⓜodel; here I simply apply a linear model to retrieve the value of EMOdds derived from the stakes. The purpose of measuring the edge that overcame the bookmakers’ vigorish is to learn the leverage of firm A's staking through agency A per 1 unit of edge in the odds price.
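The following is a minimal sketch of such a linear model in R, not the models fitted in this paper. It assumes the stake is regressed on the odds price as the only covariate; the direction of the regression, the simulated data, and the column names `em_odds` and `stakes` are assumptions made for illustration.

```r
# Simulated data standing in for the real staking records (hypothetical).
set.seed(1)
n <- 200
em_odds <- runif(n, min = 1.5, max = 3.0)           # odds price per bet
stakes  <- 100 + 40 * em_odds + rnorm(n, sd = 10)   # stake with random noise
bets <- data.frame(em_odds = em_odds, stakes = stakes)

# Simple linear model with the odds price as the only covariate.
fit <- lm(stakes ~ em_odds, data = bets)

# Coefficient summary and analysis of variance of the fitted model.
print(summary(fit))
print(anova(fit))
```

The slope on `em_odds` is the change in stake per 1 unit change in the odds price, which is the kind of leverage the paragraph above describes.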
table 4.2.1 : Summary of linear models.
table 4.2.2 : Anova of linear models.
Because the coefficients of the regression models take up a lot of space in this article, I wrote a Shiny app that displays the summary and ANOVA of the models. Kindly click on Report with ShinyApps to use the app. The website takes a few minutes to open, since the dataset contains 5 years of soccer data on firm A's staking via agency A, which makes it heavy to load.
From my experience working at 188Bet, Singbet and AS3388, the odds price of the favorite team winning is the standard reference, the draw odds are adjusted slightly, and the underdog team is ignored.
Steven Xu (2013) (the 16th paper under "industry knowledge and academic research" in 7.4 References) conducted a case study comparing the efficiency of opening and closing prices in the NFL and college American football leagues. It found that the closing price is now more efficient and accurate than the opening price, compared with the years 1980~1990. This may be because multi-million-dollar stakes from informed traders or smart punters tune the closing price toward the true likelihood.
To test these empirical clichés, I conducted thorough research in ®γσ, Eng Lian Hu (2016) (the 3rd paper under "industry knowledge and academic research" in 7.4 References; I completed the research in 2010 but wrote the thesis in 2016), which concluded that the opening prices of Asian Handicap and Goal Lines from 29 bookmakers are more efficient than mine. However, my later ®γσ, Eng Lian Hu (2014) (the 4th paper under "industry knowledge and academic research" in 7.4 References) applied a Kelly staking model that made a return of more than 30% per season. Meanwhile, Dixon and Coles (1996) and Crowder, Dixon, Ledford and Robinson (2001) (the 27th paper under "industry knowledge and academic research" in 7.4 References) built two models that compare the accuracy of home win, draw and away win predictions. A plain Poisson model was reported to predict home wins more accurately, and therefore an ad hoc inflation parameter was required to increase the accuracy of prediction. Feel free to learn about Dixon and Coles (1996) in section 4.4 Poisson Ⓜodel.
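To make the plain Poisson model concrete, the sketch below derives home win, draw and away win probabilities from two independent Poisson scoring rates. It is a simplified illustration only: it leaves out the Dixon and Coles (1996) adjustment, and the scoring rates, the helper name `match_probs`, and the cut-off at 10 goals are assumptions.

```r
# Outcome probabilities under independent Poisson goal counts.
# lambda_home and lambda_away are hypothetical expected goals per team.
match_probs <- function(lambda_home, lambda_away, max_goals = 10) {
  goals <- 0:max_goals
  # Joint probability matrix: rows are home goals, columns are away goals.
  joint <- outer(dpois(goals, lambda_home), dpois(goals, lambda_away))
  c(
    home = sum(joint[lower.tri(joint)]),  # home goals > away goals
    draw = sum(diag(joint)),              # equal scores
    away = sum(joint[upper.tri(joint)])   # away goals > home goals
  )
}

probs <- match_probs(lambda_home = 1.5, lambda_away = 1.1)
print(round(probs, 4))
```

Truncating at 10 goals drops only a negligible tail for typical soccer scoring rates, so the three probabilities sum to almost exactly 1.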